Modal Clustering in Univariate, Conjugate Dirichlet Process Mixture Models
Abstract
The Dirichlet process mixture (DPM) model is a popular nonparametric Bayesian tool for modeling unknown distributions through mixtures of components. Integrating out the latent location variables in a DPM model leads to a product partition model. This paper describes a mode-finding algorithm that quickly finds either the maximizer of the partition posterior or the maximizer of the partition likelihood in a class of product partition models. Sufficient conditions under which the algorithm can be used are given, and these conditions are shown to be satisfied in several univariate, conjugate DPM models. Clustering observations based on a modal partition is a model-based clustering procedure that, prior to this work, could only be approximated. The proposed procedure is demonstrated on a microarray dataset of expression levels of more than 10,000 genes.
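To make the idea of a modal partition concrete, the following is a minimal sketch rather than the paper's algorithm. It rests on assumptions that the abstract does not state: that the best partition groups observations that are adjacent in sorted order, that each component is normal with known variance and a conjugate normal prior on its mean, and that the DPM partition prior contributes a factor alpha * Gamma(|S|) per cluster S. The function name and all parameters (alpha, sigma2, mu0, tau2) are hypothetical choices for illustration.

```python
import math

def modal_partition(y, alpha=1.0, sigma2=1.0, mu0=0.0, tau2=10.0):
    """Search contiguous partitions of the sorted data for the highest
    unnormalized log posterior. Illustrative model (an assumption):
    y_i ~ N(mu, sigma2) within a cluster, mu ~ N(mu0, tau2)."""
    ys = sorted(y)
    n = len(ys)

    # Prefix sums of deviations from the prior mean, so that each
    # cluster score can be evaluated in constant time.
    p1, p2 = [0.0], [0.0]
    for v in ys:
        d = v - mu0
        p1.append(p1[-1] + d)
        p2.append(p2[-1] + d * d)

    def score(i, j):
        """Log cohesion plus log marginal likelihood of ys[i:j]."""
        m = j - i
        s1 = p1[j] - p1[i]
        s2 = p2[j] - p2[i]
        log_cohesion = math.log(alpha) + math.lgamma(m)  # alpha * (m-1)!
        log_marginal = (
            -0.5 * m * math.log(2 * math.pi * sigma2)
            - 0.5 * math.log((sigma2 + m * tau2) / sigma2)
            - (s2 - tau2 * s1 * s1 / (sigma2 + m * tau2)) / (2 * sigma2)
        )
        return log_cohesion + log_marginal

    # best[j]: best score for the first j sorted observations.
    # back[j]: start index of the last cluster in that best solution.
    best = [0.0] + [-math.inf] * n
    back = [0] * (n + 1)
    for j in range(1, n + 1):
        for i in range(j):
            cand = best[i] + score(i, j)
            if cand > best[j]:
                best[j] = cand
                back[j] = i

    # Walk the back-pointers to recover the clusters.
    clusters = []
    j = n
    while j > 0:
        i = back[j]
        clusters.append(ys[i:j])
        j = i
    clusters.reverse()
    return clusters, best[n]


if __name__ == "__main__":
    data = [0.0, 0.1, -0.1, 5.0, 5.1, 4.9]
    clusters, log_score = modal_partition(data)
    print(clusters, log_score)
```

The double loop evaluates on the order of n^2 candidate clusters, each in constant time thanks to the prefix sums. Restricting the search to contiguous blocks is what keeps the search polynomial; whether that restriction is valid depends on conditions of the kind the abstract refers to, which this sketch simply takes for granted.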
Related resources
Modal Clustering in a Univariate Class of Product Partition Models
This paper presents an algorithm for finding the maximum a posteriori (MAP) clustering in a class of univariate product partition models. While the number of possible clusterings of n observations grows according to the Bell exponential number, the dynamic programming algorithm presented here exploits properties of the model to provide an O(n^2) search. Hence, the algorithm can be used to find t...
Bayesian Order-Adaptive Clustering for Video Segmentation
Video segmentation requires the partitioning of a series of images into groups that are both spatially coherent and smooth along the time axis. We formulate segmentation as a Bayesian clustering problem. Context information is propagated over time by a conjugate structure. The level of segment resolution is controlled by a Dirichlet process prior. Our contributions include a conjugate nonparame...
Smooth Image Segmentation by Nonparametric Bayesian Inference
A nonparametric Bayesian model for histogram clustering is proposed to automatically determine the number of segments when Markov Random Field constraints enforce smooth class assignments. The nonparametric nature of this model is implemented by a Dirichlet process prior to control the number of clusters. The resulting posterior can be sampled by a modification of a conjugate-case sampling algo...
Model-Based Clustering for Expression Data via a Dirichlet Process Mixture Model
This chapter describes a clustering procedure for microarray expression data based on a well-defined statistical model, specifically, a conjugate Dirichlet process mixture model. The clustering algorithm groups genes whose latent variables governing expression are equal, that is, genes belonging to the same mixture component. The model is fit with Markov chain Monte Carlo and the computational ...
Sequentially-Allocated Merge-Split Sampler for Conjugate and Nonconjugate Dirichlet Process Mixture Models
This paper proposes a new efficient merge-split sampler for both conjugate and nonconjugate Dirichlet process mixture (DPM) models. These Bayesian nonparametric models are usually fit using Markov chain Monte Carlo (MCMC) or sequential importance sampling (SIS). The latest generation of Gibbs and Gibbs-like samplers for both conjugate and nonconjugate DPM models effectively update the model para...